ISADORA: A Speech Modelling Network Based on Hidden Markov Models

Authors

  • E G Schukat-Talamazzini
  • H Niemann
Abstract

In this paper we present the ISADORA system, which provides highly flexible speech recognition based on HMM technology together with a hierarchical representation of speech units. Markov model topologies, subword unit inventories, regular grammars expressed in finite-state or phrase-structure style, and even the analysis tasks themselves are explicitly represented by the nodes of a large speech unit network. Thus, nothing that can be "said in the language of Markov models" needs to be hard-wired in the program code. In contrast to traditional compiled network recognizers, units, grammars, and tasks may be created or modified at analysis time, and the outcome of the decoding process is a structured symbolic description of the sensory input. Our architecture has proven extremely useful in prototyping new kinds of subword units. Besides generalized triphones and context-freezing units, a new subword speech unit for automatic speech recognition has been implemented. The so-called polyphones are phone-like units which generalize the well-known concept of triphone units in that more than one left or right context symbol is allowed. Moreover, context items may be of segmental or suprasegmental nature. In addition, a powerful new training paradigm based on the propagation of statistical parameters through the speech unit network has been introduced. The propagation-based Baum-Welch training algorithm is capable of fast and robust estimation of very large parameter sets: the real-time factor for training is very low (0.3) and independent of utterance duration and model complexity. The paper closes with the presentation of performance figures for numerous continuous speech recognition experiments. With a suitable inventory of polyphones as subword units, a 162-word vocabulary (or a 1081-word vocabulary; results in parentheses), and no grammar, speaker-dependent training yielded a word accuracy of 98.3 % (92.4 %). In the speaker-independent mode, accuracies of 91.8 % (84.5 %) have been achieved. This performance is among the best reported so far for speaker-independent large-vocabulary continuous speech recognition.
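
The polyphone idea described in the abstract (a phone with more than one left or right context symbol, generalizing the triphone) can be illustrated with a minimal Python sketch. This is not the ISADORA implementation: the names `Polyphone`, `generalizes`, and `polyphone_at` are hypothetical, and representing context items as plain strings (which could stand for segmental or suprasegmental symbols) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Polyphone:
    """A phone with arbitrary-length context (hypothetical layout)."""
    left: Tuple[str, ...]   # left context, nearest symbol last
    phone: str              # the central phone
    right: Tuple[str, ...]  # right context, nearest symbol first

    def generalizes(self, other: "Polyphone") -> bool:
        """True if self is equal to, or less specific than, other."""
        if self.phone != other.phone:
            return False
        nl, nr = len(self.left), len(self.right)
        if nl > len(other.left) or nr > len(other.right):
            return False
        # Guard nl == 0: other.left[-0:] would be the whole tuple.
        left_ok = nl == 0 or other.left[-nl:] == self.left
        right_ok = other.right[:nr] == self.right
        return left_ok and right_ok


def polyphone_at(symbols: Sequence[str], i: int,
                 n_left: int, n_right: int) -> Polyphone:
    """Cut a polyphone around position i with the given context widths."""
    left = tuple(symbols[max(0, i - n_left):i])
    right = tuple(symbols[i + 1:i + 1 + n_right])
    return Polyphone(left, symbols[i], right)


# Assumed toy symbol sequence; "#" marks a word boundary here.
symbols = ["#", "h", "a", "l", "o", "#"]
triphone = polyphone_at(symbols, 2, 1, 1)  # one symbol on each side
wide = polyphone_at(symbols, 2, 2, 2)      # two symbols on each side
```

In this sketch a triphone is simply the special case with one context symbol on each side, and `triphone.generalizes(wide)` holds because the wider unit shares the same nearest context symbols.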


Related articles

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...


Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...


SVR vs MLP for Phone Duration Modelling in HMM-based Speech Synthesis

In this paper we investigate external phone duration models (PDMs) for improving the quality of synthetic speech in hidden Markov model (HMM)-based speech synthesis. Support Vector Regression (SVR) and Multilayer Perceptron (MLP) models were used for this task. The SVR and MLP PDMs were compared with the explicit duration modelling of hidden semi-Markov models (HSMMs). Experiments done on an American Engl...



Publication date: 1993